Goto

Collaborating Authors

 Central Serbia



Constraint- and Score-Based Nonlinear Granger Causality Discovery with Kernels

Murphy, Fiona, Benavoli, Alessio

arXiv.org Machine Learning

Granger causality (GC) [15] is a time series causal discovery framework that uses predictive modeling to identify the underlying causal structure of a time series system. Relying on the assumption that cause precedes effect, GC assesses whether including the lagged information from one time series in the autoregressive model of a second time series enhances its predictions. This improvement indicates a predictive relationship between the time series variables, where one time series provides supplemental information about the future of another time series, thereby signifying the presence of a (Granger) causal relationship. GC requires only observational data, and has been used for time series causal discovery across diverse domains, including climate science [33], political and social sciences [17], econometrics [4], and biological systems studies [13]. The original formulation of GC requires several assumptions to be satisfied for causal identifiability. In regards to the candidate time series system, it is assumed that the time series variables are stationary, and that all variables are observed (absence of latent confounders). GC was initially proposed for bivariate time series systems, but was generalised for the multivariate setting to accommodate the assumption that all relevant variables are included in the analysis [15]. Additional assumptions are made with regard to the types of causal relationships that can be identified within the time series system. GC cannot estimate a causal relationship between time series at an instantaneous time point, relying on the relationship between the lags and predicted values to determine a GC relationship.


How AI Companies Got Caught Up in US Military Efforts

WIRED

Two years ago, companies like Meta and OpenAI were united against military use of their tools. Now all of that has changed. At the start of 2024, Anthropic, Google, Meta, and OpenAI were united against military use of their AI tools. But over the next 12 months, something changed. In January, OpenAI quietly rescinded its ban on using AI for "military and warfare" purposes, and soon after it was reported to be working on "a number of projects" with the Pentagon. In November, in the same week that Donald Trump was reelected US president, Meta announced that the United States and select allies would be able to employ Llama for defense uses.


Physics-Informed Inductive Biases for Voltage Prediction in Distribution Grids

Okoyomon, Ehimare, Yaniv, Arbel, Goebel, Christoph

arXiv.org Artificial Intelligence

Voltage prediction in distribution grids is a critical yet difficult task for maintaining power system stability. Machine learning approaches, particularly Graph Neural Networks (GNNs), offer significant speedups but suffer from poor generalization when trained on limited or incomplete data. In this work, we systematically investigate the role of inductive biases in improving a model's ability to reliably learn power flow. Specifically, we evaluate three physics-informed strategies: (i) power-flow-constrained loss functions, (ii) complex-valued neural networks, and (iii) residual-based task reformulation. Using the ENGAGE dataset, which spans multiple low- and medium-voltage grid configurations, we conduct controlled experiments to isolate the effect of each inductive bias and assess both standard predictive performance and out-of-distribution generalization. Our study provides practical insights into which model assumptions most effectively guide learning for reliable and efficient voltage prediction in modern distribution networks.


Fine-Tuning BERT for Domain-Specific Question Answering: Toward Educational NLP Resources at University Scale

Montfrond, Aurélie

arXiv.org Artificial Intelligence

Prior work on scientific question answering has largely emphasized chatbot-style systems, with limited exploration of fine-tuning foundation models for domain-specific reasoning. In this study, we developed a chatbot for the University of Limerick's Department of Electronic and Computer Engineering to provide course information to students. A custom dataset of 1,203 question-answer pairs in SQuAD format was constructed using the university book of modules, supplemented with manually and synthetically generated entries. We fine-tuned BERT (Devlin et al., 2019) using PyTorch and evaluated performance with Exact Match and F1 scores. Results show that even modest fine-tuning improves hypothesis framing and knowledge extraction, demonstrating the feasibility of adapting foundation models to educational domains. While domain-specific BERT variants such as BioBERT and SciBERT exist for biomedical and scientific literature, no foundation model has yet been tailored to university course materials. Our work addresses this gap by showing that fine-tuning BERT with academic QA pairs yields effective results, highlighting the potential to scale towards the first domain-specific QA model for universities and enabling autonomous educational knowledge systems.


hls4ml: A Flexible, Open-Source Platform for Deep Learning Acceleration on Reconfigurable Hardware

Schulte, Jan-Frederik, Ramhorst, Benjamin, Sun, Chang, Mitrevski, Jovan, Ghielmetti, Nicolò, Lupi, Enrico, Danopoulos, Dimitrios, Loncar, Vladimir, Duarte, Javier, Burnette, David, Laatu, Lauri, Tzelepis, Stylianos, Axiotis, Konstantinos, Berthet, Quentin, Wang, Haoyan, White, Paul, Demirsoy, Suleyman, Colombo, Marco, Aarrestad, Thea, Summers, Sioni, Pierini, Maurizio, Di Guglielmo, Giuseppe, Ngadiuba, Jennifer, Campos, Javier, Hawks, Ben, Gandrakota, Abhijith, Fahim, Farah, Tran, Nhan, Constantinides, George, Que, Zhiqiang, Luk, Wayne, Tapper, Alexander, Hoang, Duc, Paladino, Noah, Harris, Philip, Lai, Bo-Cheng, Valentin, Manuel, Forelli, Ryan, Ogrenci, Seda, Gerlach, Lino, Flynn, Rian, Liu, Mia, Diaz, Daniel, Khoda, Elham, Quinnan, Melissa, Solares, Russell, Parajuli, Santosh, Neubauer, Mark, Herwig, Christian, Tsoi, Ho Fung, Rankin, Dylan, Hsu, Shih-Chieh, Hauck, Scott

arXiv.org Artificial Intelligence

We present hls4ml, a free and open-source platform that translates machine learning (ML) models from modern deep learning frameworks into high-level synthesis (HLS) code that can be integrated into full designs for field-programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs). With its flexible and modular design, hls4ml supports a large number of deep learning frameworks and can target HLS compilers from several vendors, including Vitis HLS, Intel oneAPI and Catapult HLS. Together with a wider eco-system for software-hardware co-design, hls4ml has enabled the acceleration of ML inference in a wide range of commercial and scientific applications where low latency, resource usage, and power consumption are critical. In this paper, we describe the structure and functionality of the hls4ml platform. The overarching design considerations for the generated HLS code are discussed, together with selected performance results.


Communication-Efficient Learning for Satellite Constellations

Tudose, Ruxandra-Stefania, Grüss, Moritz H. W., Kim, Grace Ra, Johansson, Karl H., Bastianello, Nicola

arXiv.org Artificial Intelligence

Satellite constellations in low-Earth orbit are now widespread, enabling positioning, Earth imaging, and communications. In this paper we address the solution of learning problems using these satellite constellations. In particular, we focus on a federated approach, where satellites collect and locally process data, with the ground station aggregating local models. We focus on designing a novel, communication-efficient algorithm that still yields accurate trained models. To this end, we employ several mechanisms to reduce the number of communications with the ground station (local training) and their size (compression). We then propose an error feedback mechanism that enhances accuracy, which yields, as a byproduct, an algorithm-agnostic error feedback scheme that can be more broadly applied. We analyze the convergence of the resulting algorithm, and compare it with the state of the art through simulations in a realistic space scenario, showcasing superior performance.


Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

Li, Yunxin, Chen, Xinyu, Jiang, Shenyuan, Shi, Haoyuan, Liu, Zhenyu, Zhang, Xuanyu, Deng, Nanhao, Xu, Zhenran, Ma, Yicheng, Zhang, Meishan, Hu, Baotian, Zhang, Min

arXiv.org Artificial Intelligence

We present Uni-MoE 2.0 from the Lychee family. As a fully open-source omnimodal large model (OLM), it substantially advances Lychee's Uni-MoE series in language-centric multimodal understanding, reasoning, and generating. Based on the dense LLM, we build Uni-MoE-2.0-Omni from scratch through three core contributions: dynamic-capacity Mixture-of-Experts (MoE) design, a progressive training strategy enhanced with an iterative reinforcement strategy, and a carefully curated multimodal data matching technique. It is capable of omnimodal understanding, as well as generating images, text, and speech. Architecturally, our new MoE framework balances computational efficiency and capability for 10 cross-modal inputs using shared, routed, and null experts, while our Omni-Modality 3D RoPE ensures spatio-temporal cross-modality alignment in the self-attention layer. For training, following cross-modal pretraining, we use a progressive supervised fine-tuning strategy that activates modality-specific experts and is enhanced by balanced data composition and an iterative GSPO-DPO method to stabilise RL training and improve reasoning. Data-wise, the base model, trained on approximately 75B tokens of open-source multimodal data, is equipped with special speech and image generation tokens, allowing it to learn these generative tasks by conditioning its outputs on linguistic cues. Extensive evaluation across 85 benchmarks demonstrates that our model achieves SOTA or highly competitive performance against leading OLMs, surpassing Qwen2.5-Omni (trained with 1.2T tokens) on over 50 of 76 benchmarks. Key strengths include video understanding (+7% avg. of 8), omnimodallity understanding (+7% avg. of 4), and audiovisual reasoning (+4%). It also advances long-form speech processing (reducing WER by 4.2%) and leads in low-level image processing and controllable generation across 5 metrics.


Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models

Sheng, Zhichao, Zhou, Shilin, Gong, Chen, Li, Zhenghua

arXiv.org Artificial Intelligence

Large Audio Language Models (LALMs), powered by the chain-of-thought (CoT) paradigm, have shown remarkable reasoning capabilities. Intuitively, different problems often require varying depths of reasoning. While some methods can determine whether to reason for a given problem, they typically lack a fine-grained mechanism to modulate how much to reason. This often results in a ``one-size-fits-all'' reasoning depth, which generates redundant overthinking for simple questions while failing to allocate sufficient thought to complex ones. In this paper, we conduct an in-depth analysis of LALMs and find that an effective and efficient LALM should reason smartly by adapting its reasoning depth to the problem's complexity. To achieve this, we propose a difficulty-adaptive reasoning method for LALMs. Specifically, we propose a reward function that dynamically links reasoning length to the model's perceived problem difficulty. This reward encourages shorter, concise reasoning for easy tasks and more elaborate, in-depth reasoning for complex ones. Extensive experiments demonstrate that our method is both effective and efficient, simultaneously improving task performance and significantly reducing the average reasoning length. Further analysis on reasoning structure paradigm offers valuable insights for future work.


Diff-XYZ: A Benchmark for Evaluating Diff Understanding

Glukhov, Evgeniy, Conti, Michele, Bogomolov, Egor, Golubev, Yaroslav, Bezzubov, Alexander

arXiv.org Artificial Intelligence

Reliable handling of code diffs is central to agents that edit and refactor repositories at scale. We introduce Diff-XYZ, a compact benchmark for code-diff understanding with three supervised tasks: apply (old code $+$ diff $\rightarrow$ new code), anti-apply (new code $-$ diff $\rightarrow$ old code), and diff generation (new code $-$ old code $\rightarrow$ diff). Instances in the benchmark are triples $\langle \textit{old code}, \textit{new code}, \textit{diff} \rangle$ drawn from real commits in CommitPackFT, paired with automatic metrics and a clear evaluation protocol. We use the benchmark to do a focused empirical study of the unified diff format and run a cross-format comparison of different diff representations. Our findings reveal that different formats should be used depending on the use case and model size. For example, representing diffs in search-replace format performs best for larger models across most tasks, while structured udiff variants offer similar but slightly weaker performance. In contrast, smaller open models benefit little from any formatting choice. The Diff-XYZ benchmark is a reusable foundation for assessing and improving diff handling in LLMs that can aid future development of diff formats and models editing code. The dataset is published on HuggingFace Hub: https://huggingface.co/datasets/JetBrains-Research/diff-xyz.